1. INTRODUCTION

Security is frequently overlooked, despite the risk of losing important possessions. Doors play a
crucial role in a building because they control access to its rooms and to areas that hold confidential
material, so every door should be equipped with a dependable security system. Manually operated doors
can be automated to support everyday activity while also incorporating a built-in security mechanism.
The primary objective of an office security system built around a smart door is to mitigate losses,
namely the loss or compromise of critical documents and valuable data. For organizations whose staff
hold varying roles, a smart door system has the further advantage of restricting access to authorized
individuals. Contemporary systems continue to rely on fingerprint authentication, a method that presents
several challenges. Its use is problematic during a pandemic because hands can act as vectors for viral
transmission. Read failures are common when the scanning surface is unclean. In addition,
fingerprint-based systems are susceptible to duplication, require data to be entered on each individual
machine, and lack remote monitoring and control capabilities [1], [2].

In office settings, particularly in the banking sector, inadequate physical access security at points
such as doors can lead to significant financial losses from the theft of assets or currency, as well as
losses from unauthorized access to sensitive data or information. Banks must handle data with caution,
ensuring that only authorized staff are granted access and use privileges [3], [4]. The local government
of Odo-Otin in Osun State reported an armed robbery in which a group wearing hooded robes targeted two
banks. The robbers' attempt to enter the bank was thwarted by a security door at the entrance. However,
they managed to take the funds from the ATM located outside the bank premises [5].

Previous studies have demonstrated a smart door system incorporating facial recognition technology
that achieved an accuracy rate of 94% [6]. That system was a prototype built around a solenoid lock
mechanism. One limitation of the implementation is that it does not integrate a buzzer to indicate when
the door has not locked properly. In addition, its algorithm does not incorporate deep learning
techniques, and the system does not make use of the internet of things (IoT).

Another study developed a door system incorporating an alarm and an automated locking mechanism that
can be activated from an Android application. The alarm is triggered when the door is opened, and the
automatic lock engages a few minutes after the door is closed. The Android application does not display
the current status of the door, that is, whether it is locked or inaccessible. The application also
allows the door to be locked remotely [7]. That study uses a solenoid lock as a prototype and does not
incorporate face recognition. It uses IoT technology to control door opening and closing exclusively
through an Android application.

Against this background, the authors aim to develop a smart door system for access security doors using
the convolutional neural network (CNN) approach. The primary contributions of the proposed approach are
as follows: first, a method for processing image datasets in small quantities; second, applicability to
building security systems, as the approach reduces facial recognition errors when identifying unfamiliar
individuals; and third, adaptability to devices with constrained computational capabilities, such as the
Raspberry Pi.


2. RELATED EXISTING REVIEW 

The significance of single-board computers such as the Raspberry Pi has grown with the advancement of
IoT technology. Previous research has used the Raspberry Pi successfully in many applications, including
smart door systems with facial recognition, and has reported accuracy rates of at least 94%. Facial
biometrics are employed in smart door systems to restrict unauthorized access, enhancing security and
mitigating the risk of intrusion or tampering [8]-[14]. One door unlocking design incorporates face
recognition through a "Doorlock" application installed on a smartphone, the GSM module of the
smartphone, and an active Raspberry Pi device. The "Doorlock" program establishes a connection with the
system automatically and can receive commands from the user [15]-[17].

Data preprocessing prepares raw data for processing by machine learning or deep learning models.
Common preprocessing techniques include grayscale conversion, edge detection, Gaussian smoothing,
contrast enhancement, binarization, facial cropping, image resizing, and face alignment [18]-[21]. Image
preprocessing shortens the processing time and increases the likelihood of a correct match. Face images
are preprocessed to meet feature extraction requirements [22]-[25].
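
As a minimal sketch of two of the techniques listed above (grayscale conversion and image resizing), followed by scaling pixel values to the range [0, 1], the following uses only NumPy. The function names, the luminance weights (the widely used BT.601 values), the nearest-neighbour resize, and the 128 x 128 target size are illustrative assumptions and are not taken from the pipeline used in this study.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an H x W x 3 RGB image to H x W grayscale (assumed BT.601 weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb.astype(np.float64) @ weights

def resize_nearest(img, out_h, out_w):
    """Resize a 2-D image by nearest-neighbour sampling."""
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return img[rows[:, None], cols]

def preprocess(rgb, size=128):
    """Grayscale, resize to size x size, and scale pixel values to [0, 1]."""
    gray = to_grayscale(rgb)
    small = resize_nearest(gray, size, size)
    return small / 255.0

# Usage with a synthetic 480 x 640 frame standing in for a camera image.
frame = np.random.default_rng(0).integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
face = preprocess(frame)
print(face.shape)
```

In practice a library such as OpenCV would normally handle these steps, together with face cropping and alignment, which this sketch omits.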

The MobileNet V2 architecture is designed for use in mobile applications as well as computer
devices and is available through the TensorFlow library. MobileNet V2 is a development of the previous
version, which also used depthwise separable convolution (DSP) techniques, namely depthwise convolution
(DW) and pointwise convolution (PW). MobileNet V2 differs from the previous version in the addition of
bottleneck and shortcut connection features [26], [27]. Many current studies use MobileNet V2 because
transfer learning from this model gives better results with fewer parameters than a regular CNN
[28]-[30]. Several studies have also compared VGG16 and MobileNet V2 and found that MobileNet V2
transfer learning achieved better accuracy [31].
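
To illustrate why depthwise separable convolution reduces the parameter count, the short calculation below compares a standard convolution with a depthwise convolution followed by a pointwise (1 x 1) convolution. The kernel size and channel counts are arbitrary example values, and biases are ignored; the figures are not taken from the MobileNet V2 configuration used in this study.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution plus a pointwise 1 x 1 convolution."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

k, c_in, c_out = 3, 32, 64        # example layer shape (assumed)
std = standard_conv_params(k, c_in, c_out)    # 18432
sep = separable_conv_params(k, c_in, c_out)   # 2336
print(std, sep, std / sep)
```

For this example layer the separable form needs roughly one eighth of the weights, which is the kind of saving that makes the architecture attractive on constrained devices.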

The classic classification paradigm has a closed-set configuration, with training and testing classes
drawn from the same set (objects to be recognized). This can result in overconfident predictions of a
known class: if the model encounters data from an unknown category, it will recognize it as a known
category. An open-set configuration is advocated in order to retain classification performance on known
classes while rejecting new ones [32]. In open-set recognition, deep neural networks encounter object
classes that were unknown during training. Existing open-set classifiers differentiate between known and
unknown classes by measuring distance in the logit space of a network, assuming that known classes
cluster closer to the training data than unknown classes [33].
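
The distance-based rejection idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of [33] or of this study: each known class is summarized by the mean of its training logits, Euclidean distance is used, and the rejection threshold `max_distance` and the toy logits are hypothetical values.

```python
import numpy as np

def fit_class_means(logits, labels):
    """Return the known class ids and the mean logit vector of each class."""
    classes = np.unique(labels)
    means = np.stack([logits[labels == c].mean(axis=0) for c in classes])
    return classes, means

def predict_open_set(logit, classes, means, max_distance):
    """Return the nearest known class, or -1 if every class mean is too far away."""
    distances = np.linalg.norm(means - logit, axis=1)
    nearest = int(np.argmin(distances))
    if distances[nearest] > max_distance:
        return -1                      # rejected as unknown
    return int(classes[nearest])

# Toy logits for two known classes (hypothetical data).
train_logits = np.array([[5.0, 0.0], [4.0, 1.0], [0.0, 5.0], [1.0, 4.0]])
train_labels = np.array([0, 0, 1, 1])
classes, means = fit_class_means(train_logits, train_labels)

print(predict_open_set(np.array([4.4, 0.6]), classes, means, max_distance=2.0))
print(predict_open_set(np.array([20.0, 20.0]), classes, means, max_distance=2.0))
```

The first query lies close to the mean of class 0 and is accepted; the second is far from both class means and is rejected.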


3. METHOD 

The system architecture comprises Raspberry Pi hardware running the Raspbian operating system,
along with several peripheral devices. A web camera captures images for facial recognition, which is
performed by a Python script built on the OpenCV library. When the Raspberry Pi detects a recognized
face, it uses the general purpose input output (GPIO) pins to send a signal to the relay, which
activates the magnetic door lock after a 2-second delay. To allow manual unlocking from inside the room,
a door release button and an emergency break glass unit have been installed for emergency situations.
The Thingsboard community edition is used to log facial recognition activity to the cloud in real time,
which also allows users to monitor devices online and control them remotely. Figure 1 shows the hardware
design. The Raspberry Pi is wired to the relay (ground, 5 V supply, and data); the normally closed (NC)
port of the relay is connected to the door electromagnetic (EM) lock, and the COM port of the relay is
connected to the ground of the 12 V power supply. All IP cameras are connected to the Raspberry Pi to
capture face images, and the Raspberry Pi is connected to the Thingsboard server via a WiFi router.
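
The relay signalling described above can be sketched as a short pulse routine. This is a hedged illustration only: the pin number, the function name `pulse_relay`, and the reading of the 2-second figure as the time the relay stays energized are assumptions. The pin writer is passed in as a callable so the logic can be exercised without hardware; on the device it could be bound to a GPIO library call such as `GPIO.output` from RPi.GPIO.

```python
import time

RELAY_PIN = 17        # hypothetical BCM pin number for the relay data line
HOLD_SECONDS = 2.0    # assumed: time the relay stays energized

def pulse_relay(write_pin, pin=RELAY_PIN, hold=HOLD_SECONDS, sleep=time.sleep):
    """Energize the relay, wait, then de-energize it.

    With the EM lock wired through the normally closed (NC) contact, energizing
    the relay opens the contact and releases the lock; de-energizing restores
    power to the lock.
    """
    write_pin(pin, True)
    try:
        sleep(hold)
    finally:
        write_pin(pin, False)   # always restore the locked state

# Dry run that records the calls instead of driving real hardware.
events = []
pulse_relay(lambda pin, level: events.append((pin, level)),
            sleep=lambda seconds: events.append(("sleep", seconds)))
print(events)
```

Restoring the pin in a `finally` clause is a deliberate choice in this sketch, so that an interrupted wait does not leave the door unlocked.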


Figure 1. Hardware design (wiring legend: DC supply, 5 V from Raspberry Pi to relay or 12 V from the 12 V
power supply to the door EM lock; data, from Raspberry Pi to relay or to the door EM lock; ground, from
Raspberry Pi to relay or from relay to the 12 V power supply)


Figure 2 shows the software design. Access logs record the user and the confidence level of the
facial recognition, as well as the timestamp of entry to the room and the duration of activity in the
room. The state of the door lock is shown by a green indicator light: when the light is lit, the door is
locked. Two buttons are provided: "Buka Pintu" (Open Door) releases the lock, and "Kunci Pintu" (Lock
Door) locks the door. The Raspberry Pi GPIO and the Thingsboard server communicate using the message
queuing telemetry transport (MQTT) protocol.
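
As an illustration of the access log sent over MQTT, the sketch below builds a telemetry message in the JSON form accepted by the ThingsBoard MQTT device API, which receives telemetry on the topic `v1/devices/me/telemetry`. The key names (`user`, `confidence`, `door_locked`) and the example values are hypothetical; the keys used by the authors are not given in the text.

```python
import json
import time

TELEMETRY_TOPIC = "v1/devices/me/telemetry"   # ThingsBoard MQTT telemetry topic

def build_access_telemetry(user, confidence, door_locked, ts_ms=None):
    """Return a JSON telemetry payload with an explicit millisecond timestamp."""
    if ts_ms is None:
        ts_ms = int(time.time() * 1000)
    message = {
        "ts": ts_ms,
        "values": {
            "user": user,                      # hypothetical key names
            "confidence": round(confidence, 4),
            "door_locked": door_locked,
        },
    }
    return json.dumps(message)

payload = build_access_telemetry("class_0", 0.9731, False, ts_ms=1700000000000)
print(TELEMETRY_TOPIC, payload)
```

The payload would then be published with an MQTT client (for example, the `publish` method of a paho-mqtt client) authenticated with the device access token; that step is omitted here so the sketch runs without a broker.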

The dataset utilized in this study comprises three distinct types of data. The initial dataset comprises 
five distinct classes, with each class representing a face to be recognized. The subsequent dataset consists of 
six classes, encompassing five classes representing recognized faces and one class representing an 
unidentified face. The objective of this dataset is to facilitate the training of the model to generalize facial 
recognition beyond the specific five faces of interest. The third dataset employs the same data as the second 
dataset, but with the additional step of background image removal. This modification aims to direct the 
model's attention towards learning patterns from the crucial facial components. 

Two dataset sizes are used, namely 500 and 800 images per class, while a 1:6 ratio is used for the
unrecognized face class. Two splits between training (Train) data and test (Val) data are compared,
namely 80:20 and 90:10. Hyperparameter tuning is carried out on the number of epochs, with values of 30,
40, and 50, while the learning rate (LR) is fixed at 0.0001, the batch size (BS) at 32, and the random
state (RS) at 42. The best model is selected on the basis of accuracy and loss. The deep learning method
used in this study is a CNN that applies transfer learning from MobileNet V2.
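
The experimental settings above define a small grid of training runs, which can be enumerated as in the following sketch. The dictionary keys and the function name are illustrative choices; only the numeric values come from the text.

```python
from itertools import product

DATASET_SIZES = (500, 800)            # images per class
SPLITS = ((80, 20), (90, 10))         # (train %, val %)
EPOCHS = (30, 40, 50)
FIXED = {"lr": 0.0001, "batch_size": 32, "random_state": 42}

def experiment_grid():
    """Return one configuration per combination of dataset size, split, and epochs."""
    return [
        {"images_per_class": n, "train_pct": train, "val_pct": val, "epochs": ep, **FIXED}
        for n, (train, val), ep in product(DATASET_SIZES, SPLITS, EPOCHS)
    ]

grid = experiment_grid()
print(len(grid))   # 12 configurations: 2 sizes x 2 splits x 3 epoch settings
for config in grid[:3]:
    print(config)
```

Each dataset variant is therefore trained under twelve configurations, one per row of the corresponding results table.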


4. RESULTS AND DISCUSSION

In our study, we explored the impact of various hyperparameters on the model's performance.
Table 1 presents the results of this exploration, where we trained the model with five distinct classes. These
findings show how different settings influence accuracy and loss, offering insights into the optimal model
configuration.


Figure 2. Software design


Table 1. Results of training with five classes

Dataset  Train  Val  LR      Epoch  BS  RS  Accuracy  Loss
500      80     20   0.0001  30     32  42  0.9010    0.2911
500      80     20   0.0001  40     32  42  0.9340    0.2239
500      80     20   0.0001  50     32  42  0.9185    0.2564
500      90     10   0.0001  30     32  42  0.9102    0.2806
500      90     10   0.0001  40     32  42  0.9471    0.1748
500      90     10   0.0001  50     32  42  0.9262    0.2274
800      80     20   0.0001  30     32  42  0.9241    0.2408
800      80     20   0.0001  40     32  42  0.9606    0.1470
800      80     20   0.0001  50     32  42  0.9722    0.0977
800      90     10   0.0001  30     32  42  0.9650    0.1173
800      90     10   0.0001  40     32  42  0.9231    0.2460
800      90     10   0.0001  50     32  42  0.9361    0.1976


Our research extended to evaluating the model with an additional class, making six in total. Table 2
presents the outcomes of this extended training, allowing the model's performance to be compared across
class configurations.


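Table 2. Results of training with six classes

Known                               Unknown  Train  Val  LR      Epoch  BS  RS  Accuracy  Loss
5 classes, 500/class (2,500 total)  3,000    80     20   0.0001  30     32  42  0.9407    0.1793
5 classes, 500/class (2,500 total)  3,000    80     20   0.0001  40     32  42  0.9482    0.1649
5 classes, 500/class (2,500 total)  3,000    80     20   0.0001  50     32  42  0.9555    0.1396
5 classes, 500/class (2,500 total)  3,000    90     10   0.0001  30     32  42  0.9343    0.1968
5 classes, 500/class (2,500 total)  3,000    90     10   0.0001  40     32  42  0.9465    0.1713
5 classes, 500/class (2,500 total)  3,000    90     10   0.0001  50     32  42  0.9560    0.1341
5 classes, 800/class (4,000 total)  4,800    80     20   0.0001  30     32  42  0.9491    0.1610
5 classes, 800/class (4,000 total)  4,800    80     20   0.0001  40     32  42  0.9568    0.1258
5 classes, 800/class (4,000 total)  4,800    80     20   0.0001  50     32  42  0.9680    0.1005
5 classes, 800/class (4,000 total)  4,800    90     10   0.0001  30     32  42  0.9564    0.1372
5 classes, 800/class (4,000 total)  4,800    90     10   0.0001  40     32  42  0.9644    0.1115
5 classes, 800/class (4,000 total)  4,800    90     10   0.0001  50     32  42  0.9693    0.0951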


An important aspect of our research was to assess the effect of the image background on model
accuracy. Table 3 presents the results obtained from training the model with six classes but without
any background. These results underscore the influence of the background on model training and on
the effectiveness of face recognition.


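Table 3. Results of training with six classes and without background

Known                               Unknown  Train  Val  LR      Epoch  BS  RS  Accuracy  Loss
5 classes, 500/class (2,500 total)  3,000    80     20   0.0001  30     32  42  0.9411    0.1872
5 classes, 500/class (2,500 total)  3,000    80     20   0.0001  40     32  42  0.9441    0.1732
5 classes, 500/class (2,500 total)  3,000    80     20   0.0001  50     32  42  0.9636    0.1160
5 classes, 500/class (2,500 total)  3,000    90     10   0.0001  30     32  42  0.9487    0.1786
5 classes, 500/class (2,500 total)  3,000    90     10   0.0001  40     32  42  0.9535    0.1513
5 classes, 500/class (2,500 total)  3,000    90     10   0.0001  50     32  42  0.9606    0.1387
5 classes, 800/class (4,000 total)  4,800    80     20   0.0001  30     32  42  0.9615    0.1301
5 classes, 800/class (4,000 total)  4,800    80     20   0.0001  40     32  42  0.9652    0.1121
5 classes, 800/class (4,000 total)  4,800    80     20   0.0001  50     32  42  0.9729    0.0900
5 classes, 800/class (4,000 total)  4,800    90     10   0.0001  30     32  42  0.9601    0.1241
5 classes, 800/class (4,000 total)  4,800    90     10   0.0001  40     32  42  0.9697    0.0996
5 classes, 800/class (4,000 total)  4,800    90     10   0.0001  50     32  42  0.9675    0.1027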


Tables 1-3 show that the highest accuracy and the smallest loss are obtained by the model with six
classes and without background (Table 3), trained with 80% training data and 20% testing data, 50 epochs,
a batch size of 32, and a learning rate of 0.0001. Figure 3 plots the accuracy and loss metrics of the
training outcomes for three cases: 5 classes, 6 classes, and 6 classes without background.

The evaluation of the system as a whole was conducted in three distinct stages. In the first stage, a
sample of 20 individuals was used to evaluate the performance of the facial recognition. The sample
consisted of 5 individuals registered in the system with recognizable labels, 5 individuals registered
in the system with unknown labels, and 10 individuals whose facial data had not previously been included
in the dataset. Among these ten individuals, two bore a facial resemblance to registered individuals
(numbers 3 and 4 in Table 4) because they are their siblings; they were included in order to evaluate
the efficacy of the implemented method. Table 4 reports the results of the face recognition system in
real-world scenarios, showing the system's ability to distinguish between registered and unregistered
faces, which is vital for security applications.

Figure 3. Accuracy and loss graphs for the 800-image dataset (80:20 split, 50 epochs)

Real-time face recognition is evaluated by running direct recognition for a duration of three
minutes. The system then classifies the recognition outcome by calculating the average confidence over
the 20 most recent face frames belonging to the same class. This process applies to the recognized
classes, from class 0 to class 4. If the mean value is equal to or greater than 0.90, the face is
classified as recognized; if the mean value is less than 0.90, it is classified as unrecognized.
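
The averaging rule above can be sketched with a fixed-length window per class. This is an illustrative reading of the rule rather than the authors' code: the class name `RecognitionSmoother`, the `"pending"` state returned before 20 frames are available, and the example confidence values are assumptions.

```python
from collections import deque

WINDOW = 20        # number of most recent frames averaged
THRESHOLD = 0.90   # minimum mean confidence for a recognized face

class RecognitionSmoother:
    """Average the most recent confidences per class and apply a threshold."""

    def __init__(self, window=WINDOW, threshold=THRESHOLD):
        self.window = window
        self.threshold = threshold
        self.history = {}   # class id -> deque of recent confidences

    def update(self, class_id, confidence):
        scores = self.history.setdefault(class_id, deque(maxlen=self.window))
        scores.append(confidence)
        if len(scores) < self.window:
            return "pending"            # assumed behaviour before the window fills
        mean = sum(scores) / len(scores)
        return "recognized" if mean >= self.threshold else "unrecognized"

smoother = RecognitionSmoother()
results = [smoother.update(0, 0.95) for _ in range(20)]
print(results[-1])
```

Averaging over a window means that a single low-confidence or high-confidence frame cannot by itself change the decision.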

The test reveals a single failure in facial recognition. It is classed as a failure because the
facial recognition logs show that, of the 150 records for sample number 2, 74 records (49%) were
classified as recognized, while 76 records (51%) were classified as unrecognized. Sample number 3 was
identified successfully; however, of its 143 log records, 87 records (61%) were classified as
recognized, while 56 records (39%) fell into the unrecognized category.

To demonstrate the system's responsiveness, we conducted a no-touch push button test. The results, shown
in Table 5, indicate that the system responds reliably to the touchless trigger, a feature essential for
touchless security systems.


Table 4. Face recognition system result

                 System process
    Recorded     Face recognized                   Door lock opened                  Access recorded in cloud
No  in dataset   Test result      Expected result  Test result      Expected result  Test result      Expected result  Explanation
1   Yes          Recognized       Recognized       Opened           Opened           Recorded         Recorded         Class 0
2   Yes          Not recognized   Recognized       Not opened       Opened           Not recorded     Recorded         Class 1
3   Yes          Recognized       Recognized       Opened           Opened           Recorded         Recorded         Class 2
4   Yes          Recognized       Recognized       Opened           Opened           Recorded         Recorded         Class 3
5   Yes          Recognized       Recognized       Opened           Opened           Recorded         Recorded         Class 4
6   Yes          Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Class 5 (unknown)
7   Yes          Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Class 5 (unknown)
8   Yes          Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Class 5 (unknown)
9   Yes          Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Class 5 (unknown)
10  Yes          Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Class 5 (unknown)
11  No           Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Not in dataset
12  No           Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Not in dataset
13  No           Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Not in dataset
14  No           Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Not in dataset
15  No           Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Not in dataset
16  No           Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Not in dataset
17  No           Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Not in dataset
18  No           Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Not in dataset
19  No           Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Sibling of number 3
20  No           Not recognized   Not recognized   Not opened       Not opened       Not recorded     Not recorded     Sibling of number 4


Table 5. No-touch push button test result

    System process
    No-touch push button              Access recorded in cloud
No  Test result      Expected result  Test result      Expected result
1   Opened           Opened           Recorded         Recorded
2   Opened           Opened           Recorded         Recorded
3   Opened           Opened           Recorded         Recorded
4   Opened           Opened           Recorded         Recorded
5   Opened           Opened           Recorded         Recorded


Stage 3 tests the platform on the cloud server, with a single trial, to assess the performance of the
platform used. Table 6 presents the results of these platform tests, which cover the cloud-based
operations of the system; these operations are critical for scalability and remote access.


Table 6. Platform test result

                System process
    Registered  Login to cloud      Monitoring door condition  Open door lock from cloud  Lock door from cloud
No  in system   Test      Expected  Test      Expected         Test      Expected         Test      Expected
                result    result    result    result           result    result           result    result
1   Yes         Success   Success   Success   Success          Success   Success          Success   Success

To analyze the test results, a confusion matrix is used so that the accuracy of the system can be
determined. Table 7 presents the confusion matrix, a standard tool in machine learning for evaluating
classification models, from which the recall, precision, and accuracy of the system are computed in
(1)-(3).


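Table 7. Confusion matrix

                    Test result
                    TRUE   FALSE
Prediction  TRUE    26     3
            FALSE   0      45


\[ \mathrm{Recall} = \frac{TP}{TP+FN} = \frac{26}{26+0} = \frac{26}{26} = 1 \quad (1) \]

\[ \mathrm{Precision} = \frac{TP}{TP+FP} = \frac{26}{26+3} = \frac{26}{29} = 0.897 \quad (2) \]

\[ \mathrm{Accuracy} = \frac{TP+TN}{TP+TN+FP+FN} = \frac{26+45}{26+45+3+0} = \frac{71}{74} = 0.96 \quad (3) \]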


Here, TP is true positive, TN is true negative, FP is false positive, and FN is false negative.
Because the primary objective of the system is to function as a security system, it is imperative to
minimize false negatives. Hence, recall serves as the primary metric for assessment. The confusion
matrix computation indicates that the current system has a high level of accuracy and, with a recall of
1, produced no false negatives in this test. Compared with the study "Smart door system using face
recognition based on Raspberry Pi" by Azmi et al. [6], our study achieved better accuracy, together with
a recall that is well suited to security doors.
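
The three metrics can be recomputed directly from the counts in Table 7, as in the short check below. The function name and the dictionary layout are illustrative; the counts (TP = 26, FP = 3, FN = 0, TN = 45) are those reported in the confusion matrix.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute recall, precision, and accuracy from confusion matrix counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"recall": recall, "precision": precision, "accuracy": accuracy}

metrics = classification_metrics(tp=26, fp=3, fn=0, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Rounded to the precision used in the text, these values agree with (1)-(3).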

Compared with "Face Recognition-based Door Unlocking System using Raspberry Pi" by Vamsi and Sai
[11], our system can be deployed on more than one door in a building and provides more complete and
user-friendly reporting through the Thingsboard IoT platform. Compared with the real-time face mask
detection study that uses the same MobileNetV2 algorithm, "Real-Time Face Mask Detection Using
Mobilenetv2 Algorithm" by Kanna and Kumar [27], our research provides better accuracy, precision, and
recall.


5. CONCLUSION 

The study determined that, among the models examined, MobileNet V2 with an input image size of 128
pixels, a pool size of (4, 4), a dense layer of 256 units, a dropout of 0.5, a learning rate of 0.0001,
a batch size of 32, 800 images per class, 4,800 images for the unknown class, and an 80:20 split between
training and testing data performed best on the dataset used in the study. This model achieved an
accuracy of 0.9729 and a loss of 0.09. The present study has successfully developed a smart door system
that uses the CNN technique for face recognition in order to enhance the security of access doors.
This system can automatically lock and unlock doors upon successful face recognition and can also
transmit access logs and door lock status to a cloud server. Based on the system testing conducted
across three distinct stages, namely face recognition testing, no-touch button testing, and cloud server
testing, the system exhibits satisfactory performance. It achieves an accuracy of 0.96, a recall of
1.00, and a precision of 0.897.